The results of a quantitative analysis of word distribution in two fables in Ukrainian by Ivan Franko,
"Mykyta the Fox" and "Abu-Kasym's slippers", are reported. Our study consists of two parts: the
analysis of frequency-rank distributions and the application of complex networks theory. The analysis
of frequency-rank distributions shows that the texts are large enough to exhibit statistical properties.
The power-law character of these distributions (Zipf's law) holds in the rank region
$r = 20 \div 3000$ with an exponent $\alpha \simeq 1$. This substantiates the choice of the above texts
for analysing typical properties of the language complex network. In addition, the applicability of the
Simon model to describing non-asymptotic properties of word distributions is evaluated.

In describing language as a complex network, words are usually associated with nodes, whereas
the network links may be given different meanings. This results in different network representations.
In the second part of the paper, we give different representations of the language network and
perform a comparative analysis of their characteristics. Our results demonstrate that the language
network of Ukrainian is a strongly correlated scale-free small world. The empirical data obtained may
be useful for the theoretical description of language evolution.

This is illustrative material from the paper submitted in Ukrainian to the Journal of Physical
Studies (http://www.ktf.franko.lviv.ua/JPS/index.html).

Table I: Rank classification of words from Ivan Franko's "Mykyta the Fox" [1] (left part of the table) and "Abu-Kasym's
slippers" [2] (right part of the table). $r$: rank, $f$: number of occurrences of a word in the text. The lengths of the
texts are $M = 15426$ and $8002$, and their vocabularies (numbers of distinct words) are $V = 3563$ and $2392$, for
"Mykyta the Fox" and "Abu-Kasym's slippers" respectively.

Fig. 1: Frequency-rank dependence for "Mykyta the Fox". Solid curve: approximation by the power function
$f(r) \sim 1/r^{\alpha}$ with $\alpha = 1.00$. Similar dependencies result for "Abu-Kasym's slippers" and for both texts
joined together (with $\alpha = 0.97$ and $\alpha = 1.00$, respectively). Typical accuracy of $\alpha$ is
$\chi^2/\mathrm{d.o.f.} = 0.002$.
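
The frequency-rank dependence and the power-law approximation mentioned in this caption can be illustrated by a minimal sketch. The fitting procedure is not specified in the material above, so an ordinary least-squares fit of log f against log r over a chosen rank window is assumed here; the function names `rank_frequency` and `fit_zipf_exponent` are hypothetical.

```python
import math
from collections import Counter


def rank_frequency(words):
    """Return word frequencies in descending order (index 0 holds rank r = 1)."""
    return sorted(Counter(words).values(), reverse=True)


def fit_zipf_exponent(freqs, r_min=1, r_max=None):
    """Estimate alpha in f(r) ~ 1/r**alpha for ranks r_min..r_max.

    Assumption: a plain least-squares line through (log r, log f);
    the fitting method actually used for Fig. 1 is not stated.
    """
    r_max = len(freqs) if r_max is None else min(r_max, len(freqs))
    ranks = range(r_min, r_max + 1)
    xs = [math.log(r) for r in ranks]
    ys = [math.log(freqs[r - 1]) for r in ranks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The slope of log f versus log r equals -alpha.
    return -sxy / sxx
```

For a tokenised text, `fit_zipf_exponent(rank_frequency(words), r_min=20, r_max=3000)` would restrict the fit to the rank region quoted in the abstract.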




Fig. 2: Comparison of computer-generated and original texts. The curves show the percentage of
coincidence between two texts as a function of the predicted word block $U_{\mathrm{pr}}$. ○: comparison of the original
text with a text generated according to the Simon model. △: comparison of a text generated according to the Simon model
with a randomly generated text. □: comparison of two texts generated according to the Simon model.
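
For orientation, the Simon model referred to in this caption can be sketched as follows: at every step a new word is appended with a fixed probability, otherwise a word is copied from a uniformly chosen position of the text generated so far, so that frequent words are copied proportionally more often. The function name `simon_text`, the use of integer word identifiers and the parameter name `p_new` are assumptions of this sketch. The comparison procedure behind the curves is not described here and is not reproduced.

```python
import random


def simon_text(length, p_new, seed=None):
    """Generate a sequence of word ids according to the Simon model.

    With probability p_new a previously unseen word is appended;
    otherwise a word is copied from a random position of the text so far.
    """
    if length <= 0:
        return []
    rng = random.Random(seed)
    text = [0]      # the first word is necessarily new
    next_id = 1     # id that the next new word will receive
    while len(text) < length:
        if rng.random() < p_new:
            text.append(next_id)
            next_id += 1
        else:
            # A uniform choice over positions favours frequent words.
            text.append(rng.choice(text))
    return text
```

One common choice, assumed here rather than taken from the source, is to set `p_new` to the ratio of vocabulary size to text length, which for "Mykyta the Fox" would be 3563/15426, roughly 0.23.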

Fig. 3: Representation of two sentences, 1: "Першим вийшов Вовк Неситий" ("Wolf the Insatiable came out first"),
2: "Вовк м'ясиво хап - і драла!" ("The Wolf grabbed the meat and ran!"), in the form of graphs. a. L-space. Links connect
neighbouring words that belong to the same sentence. The number of neighbours of each word (the word window) is defined
by the "radius of interaction" $1 \le R \le R_{\max}$. In the given example $R = 1$. For $R = 1$ only the neighbouring
words in a sentence are connected; for $R = 2$ links connect nearest and next-nearest neighbours in a sentence;
$R = R_{\max}$ corresponds to the sentence length. b. B-space. Nodes of two sorts are present. Dark nodes: sentences,
light nodes: words that belong to them. c. P-space. All words that belong to the same sentence are connected. d. C-space.
Sentences are connected if they contain the same words. A link between the sentence nodes (1 and 2) corresponds to the
word "вовк", common to both sentences. For the nomenclature of the above spaces see Ref. [40]. For language networks, the
L-space representation was introduced in Ref. and the P-space representation was introduced in Ref. [15]. Note that at
$R = R_{\max}$ the L-space representation coincides with the P-space representation.
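
The four representations described in this caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the construction used for the tables: sentences are given as lists of tokens, the two example sentences are an assumed lowercase, punctuation-free reading of the caption, and the function names are hypothetical.

```python
from itertools import combinations


def l_space(sentences, R=1):
    """Link each word to its next R neighbours within the same sentence."""
    edges = set()
    for sent in sentences:
        for i, word in enumerate(sent):
            for j in range(i + 1, min(i + R, len(sent) - 1) + 1):
                if sent[j] != word:  # no self-loops for repeated words
                    edges.add(frozenset((word, sent[j])))
    return edges


def p_space(sentences):
    """Link all pairs of distinct words that share a sentence."""
    edges = set()
    for sent in sentences:
        for u, v in combinations(sorted(set(sent)), 2):
            edges.add(frozenset((u, v)))
    return edges


def b_space(sentences):
    """Bipartite links (sentence number, word); sentences are numbered from 1."""
    return {(i, word) for i, sent in enumerate(sentences, 1) for word in sent}


def c_space(sentences):
    """Link two sentences if they have at least one word in common."""
    edges = set()
    for (i, a), (j, b) in combinations(enumerate(sentences, 1), 2):
        if set(a) & set(b):
            edges.add((i, j))
    return edges


# Assumed tokenisation of the two example sentences.
sentences = [
    ["першим", "вийшов", "вовк", "неситий"],
    ["вовк", "м'ясиво", "хап", "і", "драла"],
]
```

With a window at least as large as the longest sentence, `l_space(sentences, R=5)` yields the same set of links as `p_space(sentences)`, in line with the remark that the two representations coincide at the maximal R.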

Table II: Quantitative characteristics of word networks for the texts under consideration in L-space for several values
of $R$. Upper part of the table: word network for the text "Abu-Kasym's slippers" [2], middle part: "Mykyta the Fox" [1],
lower part: both texts together. $V$: number of nodes, $M$: number of links, $\langle k \rangle$, $k_{\max}$: mean and
maximal node degree, $\gamma$, $\gamma_{\mathrm{int}}$: exponents of the node degree ($P(k) \sim k^{-\gamma}$) and of the
cumulative node degree ($P_{\mathrm{int}}(k) = \sum_{k'=k}^{k_{\max}} P(k')$) distributions, $\langle C \rangle$: mean
value of the clustering coefficient, $C_r$: clustering coefficient of the classical Erdős-Rényi random graph of the same
size, $\langle l \rangle$, $l_{\max}$: mean and maximal values of the shortest path length. Note that at $R = R_{\max}$
the representations of a network in L- and P-spaces coincide.
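
The quantities listed in this caption can be computed from a list of links as in the following sketch. It assumes an undirected, connected network with at least one link, takes the clustering coefficient of nodes with fewer than two neighbours to be zero, and estimates the Erdős-Rényi value as the link probability 2M/(V(V-1)); these conventions and all function names are assumptions, since the caption does not fix them.

```python
from collections import deque


def adjacency(edges):
    """Build a dict node -> set of neighbours from undirected links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nb = adj[node]
    k = len(nb)
    if k < 2:
        return 0.0  # convention assumed for nodes of degree 0 or 1
    links = sum(1 for u in nb for v in adj[u] if v in nb) / 2
    return 2 * links / (k * (k - 1))


def bfs_distances(adj, source):
    """Shortest path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def characteristics(edges):
    """Return V, M, mean/max degree, clustering and path-length statistics."""
    adj = adjacency(edges)
    V = len(adj)
    degrees = [len(nb) for nb in adj.values()]
    M = sum(degrees) // 2
    paths = [d for s in adj for d in bfs_distances(adj, s).values() if d > 0]
    return {
        "V": V,
        "M": M,
        "k_mean": 2 * M / V,
        "k_max": max(degrees),
        "C_mean": sum(clustering(adj, n) for n in adj) / V,
        "C_r": 2 * M / (V * (V - 1)),  # link probability of an ER graph
        "l_mean": sum(paths) / len(paths),
        "l_max": max(paths),
    }
```

Running a breadth-first search from every node is quadratic in the network size, which is acceptable for networks of a few thousand nodes such as those considered here.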

Fig. 4: "Mykyta the Fox": dependence of word network characteristics on the size of the word window $R$. Mean
($\langle k^* \rangle$, ○-○-○) and maximal ($k_{\max}$, □-□-□) node degrees, mean clustering coefficient
($\langle C^* \rangle$, △-△-△) and shortest path length ($\langle l^* \rangle$, ▽-▽-▽), and the cumulative node degree
distribution exponent ($\gamma_{\mathrm{int}}$) are normalized by their values at $R = R_{\max}$. An increase of $R$
increases the number of links in the network. This is the reason for the increase of $\langle C \rangle$,
$\langle k \rangle$ and $k_{\max}$ and for the decrease of $\langle l \rangle$ with $R$.




Fig. 5: Node degree distribution for "Mykyta the Fox" follows a power law $P(k) \sim 1/k^{\gamma}$ for different $R$
($R = 1$ (○-○-○), $R = R_{\max}$ (△-△-△)). A difference between the power-law exponents for different $R$ cannot be
distinguished for the texts under consideration. The solid line shows a power law with exponent $\gamma = 1.9$.

Fig. 6: Cumulative node degree distribution for "Mykyta the Fox" also shows a power-law dependence. Within the accuracy
of the plot one can see an increase of the exponent $\gamma_{\mathrm{int}}$ with $R$: $\gamma_{\mathrm{int}} = 1.12$ for
$R = 1$ (○-○-○), $\gamma_{\mathrm{int}} = 1.27$ for $R = R_{\max}$ (△-△-△).
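
The cumulative distribution plotted here is the sum of P(k') over all degrees k' not smaller than k. A minimal sketch of how it can be obtained from a list of node degrees is given below; the function names are hypothetical, and the distribution is assumed to be normalised to the number of nodes.

```python
from collections import Counter


def degree_distribution(degrees):
    """P(k): fraction of nodes having degree k, keyed by increasing k."""
    n = len(degrees)
    counts = Counter(degrees)
    return {k: c / n for k, c in sorted(counts.items())}


def cumulative_distribution(p):
    """P_int(k): sum of P(k') over all k' >= k."""
    p_int = {}
    total = 0.0
    for k in sorted(p, reverse=True):  # accumulate from the largest degree down
        total += p[k]
        p_int[k] = total
    return dict(sorted(p_int.items()))
```

Summation smooths the noisy tail of P(k) at large degrees, which is the usual motivation for plotting the cumulative distribution alongside the plain one.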




Fig. 7: "Mykyta the Fox": mean shortest path length from a node of degree $k$ to the rest of the network nodes. $R = 1$
(○-○-○), $R = R_{\max}$ (△-△-△). The decrease of $\langle l \rangle$ with $k$ indicates that hubs (the most connected
nodes) are within closer reach of each other than other nodes. The very small value of $\langle l \rangle$ indicates
that this network is a small world.
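
The dependence shown in this figure can be reproduced schematically by averaging, over all nodes of a given degree, the mean distance from such a node to every other reachable node. The sketch below assumes the network is supplied as a dictionary mapping each node to the set of its neighbours; the function name is hypothetical.

```python
from collections import defaultdict, deque


def mean_path_by_degree(adj):
    """Map degree k to the mean shortest path length from nodes of degree k."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted network.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        others = [d for node, d in dist.items() if node != source]
        if others:  # isolated nodes contribute nothing
            k = len(adj[source])
            sums[k] += sum(others) / len(others)
            counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

For a star with one centre and three leaves, the centre (degree 3) is at distance 1 from every node, while each leaf (degree 1) is on average at distance 5/3 from the others, which illustrates the decrease with k.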

Fig. 8: Mean clustering coefficient as a function of the node degree for "Mykyta the Fox". $R = 1$ (○-○-○),
$R = R_{\max}$ (△-△-△). These dependencies are characterized by a plateau at small values of $k$ followed by a decrease.
The increase of the mean clustering coefficient with $R$ is explained by the increase of the number of links with $R$
at a constant number of nodes. For a similar reason, $\langle C \rangle$ increases with the text length.
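
A dependence of this kind can be obtained by averaging the local clustering coefficient over all nodes of the same degree, as in the sketch below. The network is assumed to be given as a dictionary mapping each node to the set of its neighbours, and nodes with fewer than two neighbours are skipped because their clustering coefficient is undefined; both choices, as well as the function name, are assumptions of this illustration.

```python
from collections import defaultdict


def clustering_by_degree(adj):
    """Map degree k >= 2 to the mean clustering coefficient of nodes of degree k."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            continue  # clustering is undefined for fewer than two neighbours
        # Each link between two neighbours is seen from both ends, hence / 2.
        links = sum(1 for u in neighbours for v in adj[u] if v in neighbours) / 2
        sums[k] += 2 * links / (k * (k - 1))
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

In a triangle with one extra node attached to a corner, the two corners of degree 2 have clustering 1, while the corner of degree 3 has clustering 1/3.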